WAVES: Big Data Platform for Real-time RDF Stream Processing
Abstract
Processing data as they arrive has recently gained momentum as a way to mine continuous, high-volume, and unbounded data streams. Because this data is heterogeneous and multi-modal, RDF is widely used to provide a unified metadata layer in streaming contexts. In response to this ever-increasing demand, a number of systems and languages have been produced for RDF stream processing (RSP). However, most of them adopt a centralized execution approach, which makes it difficult to ensure correct behavior and high scalability under circumstances such as concurrent queries and increasing input load. Only a few systems have sought to distribute processing, and their implementations are still in their infancy. None of them provides a full-fledged, production-ready RSP engine that is easy to use, supports all SPARQL 1.1 operators, and is adapted to industrial needs. As a solution, we present a distributed, fault-tolerant, and scalable RSP system that exploits the Apache Storm framework.
Similar resources
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has today become one of the concerns of data science scholar...
Towards a distributed, scalable and real-time RDF Stream Processing engine
Due to the growing need to process data produced in the Semantic Web in a timely manner and to derive valuable information and knowledge from it, RDF stream processing (RSP) has emerged as an important research domain. Modern RSP systems have to address the volume and velocity characteristics encountered in the Big Data era. This comes at the price of designing high throughput, low latency, fault tolerant, hi...
Leveraging Reconfigurable Computing in Distributed Real-time Computation Systems
The Big Data processing community typically performs real-time computations on data streams with distributed systems such as Apache Storm. Such systems offer substantial parallelism; however, the communication overhead among nodes for the distribution of the workload places an upper limit on the exploitable parallelism. The contribution of the present work is the integration of a reconfig...
Applying Security to a Big Stream Cloud Architecture for the Internet of Things
The Internet of Things (IoT) is expected to interconnect billions (around 50 by 2020) of heterogeneous sensor/actuator-equipped devices denoted as “Smart Objects” (SOs), characterized by constrained resources in terms of memory, processing, and communication reliability. Several IoT applications have real-time and low-latency requirements and must rely on architectures specifically designed to ...
Chapter 1. Key Technologies for Big Data Stream Computing
1.1 Introduction Big data computing is a new trend for future computing with the quantity of data growing and the speed of data increasing. In general, there are two main mechanisms for big data computing, i.e., big data stream computing and big data batch computing. Big data stream computing is a model of straight through computing, such as Storm [1] and S4 [2] which do for stream computing wh...